On the complexity of minimum inference of regular sets
We prove results concerning the computational tractability of some problems related to determining minimum realizations of finite samples of regular sets by finite automata and regular expressions.
Finding patterns common to a set of strings
Assume a finite alphabet of constant symbols and a disjoint infinite alphabet of variable symbols. A pattern is a non-null finite string of constant and variable symbols. The language of a pattern is the set of all strings obtainable by substituting non-null strings of constant symbols for the variables of the pattern. A sample is a finite nonempty set of non-null strings of constant symbols. Given a sample S, a pattern p is descriptive of S provided the language of p contains S and does not properly contain the language of any other pattern whose language contains S. The computational problem of finding a pattern descriptive of a given sample is studied. The main result is a polynomial-time algorithm for the special case of patterns containing only one variable symbol (possibly occurring several times in the pattern). Several other results are proved concerning the class of languages generated by patterns and the problem of finding a descriptive pattern.
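The definitions in this abstract (pattern, language of a pattern) can be made concrete with a small sketch. The following is a brute-force membership test for pattern languages, not the paper's polynomial-time descriptive-pattern algorithm; the encoding (single-character symbols, a caller-supplied set of variable symbols) is an assumption for illustration.

```python
from collections import Counter

def in_pattern_language(pattern, s, variables):
    """Brute-force test: is string s in the language of `pattern`?

    `pattern` is a sequence of single-character symbols; those in
    `variables` are variable symbols, the rest are constants. Every
    variable must be replaced by the same non-null constant string
    at each of its occurrences. Illustration only (exponential in
    the number of distinct variables).
    """
    counts = Counter(sym for sym in pattern if sym in variables)
    distinct = sorted(counts)
    n_const = len(pattern) - sum(counts.values())
    slack = len(s) - n_const  # characters available for the variables

    def matches(length_of):
        pos, subst = 0, {}
        for sym in pattern:
            if sym in variables:
                piece = s[pos:pos + length_of[sym]]
                if len(piece) < length_of[sym]:
                    return False
                if subst.setdefault(sym, piece) != piece:
                    return False  # same variable, different substitution
                pos += length_of[sym]
            else:
                if pos >= len(s) or s[pos] != sym:
                    return False
                pos += 1
        return pos == len(s)

    def assign(i, remaining, lengths):
        # Enumerate a substitution length >= 1 for each distinct variable.
        if i == len(distinct):
            return remaining == 0 and matches(dict(zip(distinct, lengths)))
        c = counts[distinct[i]]
        return any(assign(i + 1, remaining - c * n, lengths + [n])
                   for n in range(1, remaining // c + 1))

    return assign(0, slack, [])
```

For example, with pattern "axbxa" and variable set {'x'}, the string "accbcca" is in the language (x substituted by "cc"), while "abba" is not, since the two occurrences of x would need two more characters than are available.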
Queries revisited
We begin with a brief tutorial on the problem of learning a finite concept class over a finite domain using membership queries and/or equivalence queries. We then sketch general results on the number of queries needed to learn a class of concepts, focusing on the various notions of combinatorial dimension that have been employed, including the teaching dimension, the exclusion dimension, the extended teaching dimension, the fingerprint dimension, the sample exclusion dimension, the Vapnik–Chervonenkis dimension, the abstract identification dimension, and the general dimension.
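As a toy illustration of equivalence-query learning over a finite concept class (a standard example, not taken from this survey), here is the classic halving algorithm: the learner proposes the majority-vote hypothesis, and every counterexample eliminates at least half of the surviving candidates, so at most log2 |C| counterexamples are needed. The set-based concept encoding and the function names are assumptions for illustration.

```python
def halving_learn(concepts, domain, eq_oracle):
    """Exact learning with equivalence queries (halving algorithm).

    `concepts`: finite list of candidate concepts, each a frozenset
    over the finite `domain`; the target is assumed to be among them.
    `eq_oracle(h)`: returns None if h equals the target concept,
    otherwise some domain element on which h and the target disagree.
    """
    version_space = list(concepts)
    while True:
        # Majority-vote hypothesis over the surviving candidates.
        hyp = frozenset(
            x for x in domain
            if 2 * sum(x in c for c in version_space) > len(version_space))
        cex = eq_oracle(hyp)
        if cex is None:
            return hyp
        # The target disagrees with hyp on cex, so its label is the flip;
        # every candidate agreeing with hyp on cex (at least half) is removed.
        label = cex not in hyp
        version_space = [c for c in version_space if (cex in c) == label]
```

With the class of all 16 subsets of a 4-element domain, for instance, the target is identified after at most 4 counterexamples.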
Inductive inference of formal languages from positive data
We consider inductive inference of formal languages, as defined by Gold (1967), in the case of positive data, i.e., when the examples of a given formal language are successive elements of some arbitrary enumeration of the elements of the language. We prove a theorem characterizing when an indexed family of nonempty recursive formal languages is inferrable from positive data. From this theorem we obtain other useful conditions for inference from positive data, and give several examples of their application. We give counterexamples to two variants of the characterizing condition, and investigate conditions for inference from positive data that avoid “overgeneralization.”
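Identification in the limit from positive data can be illustrated with a tiny example (my own, not from the paper): for the indexed family L_n = {a, aa, ..., a^n} over a unary alphabet, the learner that conjectures the least index consistent with the data seen so far converges on any positive presentation, and it never overgeneralizes, since its conjecture never properly contains a smaller consistent language. The encoding below is an assumption for illustration.

```python
def limit_learner(presentation):
    """Identification in the limit from positive data, for the
    indexed family L_n = {a^1, ..., a^n} over the alphabet {a}.

    `presentation` is an iterable enumerating elements of the target
    language (strings of 'a's, repetitions allowed). After each
    example the learner yields its current conjecture: the least
    index n such that L_n contains all examples seen so far.
    """
    conjecture = 0
    for example in presentation:
        # The least n with example in L_n is n = len(example).
        conjecture = max(conjecture, len(example))
        yield conjecture
```

On the presentation "a", "aaa", "aa", "aaa" of L_3, the conjectures are 1, 3, 3, 3, stabilizing on the correct index after the second example.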
Polynomial Identification of omega-Automata
We study identification in the limit using polynomial time and data for models of omega-automata. On the negative side, we show that non-deterministic omega-automata (of types Büchi, coBüchi, Parity, Rabin, Streett, or Muller) cannot be polynomially learned in the limit. On the positive side, we show that the omega-language classes IB, IC, IP, IR, IS, and IM, which are defined by deterministic Büchi, coBüchi, Parity, Rabin, Streett, and Muller acceptors that are isomorphic to their right-congruence automata, are identifiable in the limit using polynomial time and data. We give polynomial-time inclusion and equivalence algorithms for deterministic Büchi, coBüchi, Parity, Rabin, Streett, and Muller acceptors, which are used to show that the characteristic samples for IB, IC, IP, IR, IS, and IM can be constructed in polynomial time. We also provide polynomial-time algorithms to test whether a given deterministic automaton of type X (for X in {B, C, P, R, S, M}) is in the class IX (i.e., recognizes a language that has a deterministic automaton isomorphic to its right-congruence automaton).
Comment: This is an extended version of a paper with the same name that appeared in TACAS2
Experiments using semantics for learning language comprehension and production
Several questions in natural language learning may be addressed by studying formal language learning models. In this work we hope to contribute to a deeper understanding of the role of semantics in language acquisition. We propose a simple formal model of meaning and denotation using finite state transducers, and an algorithm that learns a meaning function from examples consisting of a situation and an utterance denoting something in the situation. We describe the results of testing this algorithm in a domain of geometric shapes and their properties and relations in several natural languages: Arabic, English, Greek, Hebrew, Hindi, Mandarin, Russian, Spanish, and Turkish. In addition, we explore how a learner who has learned to comprehend utterances might go about learning to produce them, and present experimental results for this task. One concrete goal of our formal model is to be able to give an account of interactions in which an adult provides a meaning-preserving and grammatically correct expansion of a child's incomplete utterance.
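To make the transducer idea concrete, here is a minimal sketch of running a deterministic finite state transducer that maps utterance tokens to meaning symbols. The toy geometric-shape vocabulary and all names are my own illustrative assumptions; this is not the paper's meaning model.

```python
def run_fst(transitions, start, finals, tokens):
    """Run a deterministic finite state transducer.

    `transitions`: dict mapping (state, input_token) to
    (next_state, output_string). Returns the concatenated output
    if the token sequence is accepted, else None.
    """
    state, out = start, []
    for tok in tokens:
        if (state, tok) not in transitions:
            return None  # no transition: reject
        state, piece = transitions[(state, tok)]
        out.append(piece)
    return "".join(out) if state in finals else None

# Toy transducer: a color adjective followed by a shape noun.
shape_fst = {
    (0, "red"):    (0, "RED(x) "),
    (0, "blue"):   (0, "BLUE(x) "),
    (0, "circle"): (1, "CIRCLE(x)"),
    (0, "square"): (1, "SQUARE(x)"),
}
```

For example, run_fst(shape_fst, 0, {1}, ["red", "circle"]) produces "RED(x) CIRCLE(x)", while an incomplete utterance such as ["red"] is rejected, which hints at how a meaning function could flag a child's incomplete utterance for expansion.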